Abstract. We show that every regular language defines a unique nondeterministic finite automaton (NFA), which we call "átomaton", whose states are the "atoms" of the language, that is, non-empty intersections of complemented or uncomplemented left quotients of the language. We describe methods of constructing the átomaton, and prove that it is isomorphic to the normal automaton of Sengoku, and to an automaton of Matz and Potthoff. We study "atomic" NFA's in which the right language of every state is a union of atoms. We generalize Brzozowski's double-reversal method for minimizing a deterministic finite automaton (DFA), showing that the result of applying the subset construction to an NFA is a minimal DFA if and only if the reverse of the NFA is atomic.



1 Introduction 

Nondeterministic finite automata (NFA's), introduced by Rabin and Scott [10] in 
1959, play a major role in the theory of automata. For many purposes it is neces- 
sary to convert an NFA to a deterministic finite automaton (DFA). In particular, 
for each NFA there exists a minimal DFA, unique up to isomorphism. This DFA 
is uniquely defined by every regular language, and uses the left quotients of the 
language as states. As well, it is possible to associate an NFA with each DFA, 
and this is the subject of the present paper. Our NFA is also uniquely defined 
by every regular language, and uses non-empty intersections of complemented 
and uncomplemented quotients — the "atoms" of the language — as states. 

It appears that the NFA most often associated with a regular language is 
the universal automaton, sometimes appearing under different names. A recent 
substantial survey by Lombardy and Sakarovitch [8] on the subject of the uni- 
versal automaton contains its history and a detailed discussion of its properties. 
We refer the reader to that paper, and mention only that research related to the 
universal automaton goes back to the 1970's: e.g., in [4] as reported in [1], [5, 7]. 

We call our NFA the "átomaton"³ because it is based on the atoms of a regular language; we add the accent to minimize the possible confusion between "automaton" and "átomaton". Automata isomorphic to our átomaton have previously appeared in 1992 in the little-known master's thesis [11] of Sengoku, and in the 1995 paper [9] by Matz and Potthoff.

We introduce "atomic" automata, in which the right language of any state 
is a union of some atoms. This class of automata is a generalization of residual 
automata [6] in which the right language of any state is a left quotient (which 
we prove to be a union of atoms), and includes also atomata (where the right 
language of any state is an atom), DFA's, and universal automata. 

Finally, we characterize the class of NFA's for which the subset construction
yields a minimal DFA. More specifically, we show that the subset construction 
applied to an NFA produces a minimal DFA if and only if the reverse automaton 
of that NFA is atomic. This is a generalization of Brzozowski's method for DFA 
minimization by double reversal [2]. 

Section 2 recalls properties of regular languages, finite automata, and sys- 
tems of language equations. Atoms of a regular language and the atomaton are 
introduced and studied in Section 3. In Section 4, we examine NFA's in which 
the right language of every state is a union of atoms. Brzozowski's method of 
DFA minimization is extended in Section 5, and Section 6 closes the paper. 

2 Languages, Automata, and Equations 

If $\Sigma$ is a non-empty finite alphabet, then $\Sigma^*$ is the free monoid generated by $\Sigma$. A word is any element of $\Sigma^*$, and the empty word is $\varepsilon$. The length of a word $w$ is $|w|$. A language over $\Sigma$ is any subset of $\Sigma^*$.

The following operations are defined on languages over $\Sigma$: complement ($\overline{L} = \Sigma^* \setminus L$), union ($K \cup L$), intersection ($K \cap L$), product, usually called concatenation or catenation ($KL = \{w \in \Sigma^* \mid w = uv, u \in K, v \in L\}$), positive closure ($L^+ = \bigcup_{i \ge 1} L^i$), and star ($L^* = \bigcup_{i \ge 0} L^i$). The reverse $w^R$ of a word $w \in \Sigma^*$ is defined as follows: $\varepsilon^R = \varepsilon$, and $(wa)^R = aw^R$. The reverse of a language $L$ is denoted by $L^R$ and defined as $L^R = \{w^R \mid w \in L\}$.

A nondeterministic finite automaton is a quintuple $\mathcal{N} = (Q, \Sigma, \delta, I, F)$, where $Q$ is a finite, non-empty set of states, $\Sigma$ is a finite non-empty alphabet, $\delta : Q \times \Sigma \to 2^Q$ is the transition function, $I \subseteq Q$ is the set of initial states, and $F \subseteq Q$ is the set of final states. As usual, we extend the transition function to functions $\delta' : Q \times \Sigma^* \to 2^Q$ and $\delta'' : 2^Q \times \Sigma^* \to 2^Q$. We do not distinguish these functions notationally, but use $\delta$ for all three. The language accepted by an NFA $\mathcal{N}$ is $L(\mathcal{N}) = \{w \in \Sigma^* \mid \delta(I, w) \cap F \ne \emptyset\}$. Two NFA's are equivalent if they accept the same language. The left language of a state $q$ of $\mathcal{N}$ is $L_{I,q}(\mathcal{N}) = \{w \in \Sigma^* \mid q \in \delta(I, w)\}$, and the right language of $q$ is $L_{q,F}(\mathcal{N}) = \{w \in \Sigma^* \mid \delta(q, w) \cap F \ne \emptyset\}$. So the language accepted by $\mathcal{N}$ is $L_{I,F}(\mathcal{N})$. A state is empty if its right language is empty.
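
The definitions of the extended transition function and of left and right languages can be made concrete with a small sketch. The representation below (a dictionary from (state, letter) pairs to sets of states, and the function names) is an assumption made for illustration only; the sample transitions are read off the NSE given for the NFA of Fig. 1 (a) in Example 2.

```python
def step(delta, subset, word):
    """Extended transition function: states reached from any state
    of `subset` after reading `word` (missing entries mean no move)."""
    current = set(subset)
    for letter in word:
        current = {p for q in current for p in delta.get((q, letter), ())}
    return current

def accepts(delta, initial, final, word):
    """w is in L(N) iff delta(I, w) meets F."""
    return bool(step(delta, initial, word) & set(final))

def in_left_language(delta, initial, q, word):
    """w is in L_{I,q}(N) iff q is in delta(I, w)."""
    return q in step(delta, initial, word)

def in_right_language(delta, final, q, word):
    """w is in L_{q,F}(N) iff delta(q, w) meets F."""
    return bool(step(delta, {q}, word) & set(final))

# Hypothetical encoding of the NFA of Fig. 1 (a), taken from Example 2.
DELTA = {(1, "b"): {2}, (2, "a"): {1}, (2, "b"): {2, 3},
         (3, "a"): {1}, (3, "b"): {3}}
INITIAL, FINAL = {1, 3}, {2, 3}
```

With this encoding, `accepts(DELTA, INITIAL, FINAL, "ab")` holds while `accepts(DELTA, INITIAL, FINAL, "aa")` does not.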

3 The word should be pronounced with the accent on the first a. 
A deterministic finite automaton (DFA) is a quintuple $\mathcal{D} = (Q, \Sigma, \delta, q_0, F)$, where $Q$, $\Sigma$, and $F$ are as in an NFA, $\delta : Q \times \Sigma \to Q$ is the transition function, and $q_0$ is the initial state. One can view a DFA as a special case of an NFA, where the set of initial states is $\{q_0\}$, and the range of the transition function is restricted to singletons $\{q\}$, $q \in Q$.

An incomplete deterministic finite automaton (IDFA) is a quintuple $\mathcal{I} = (Q, \Sigma, \delta, q_0, F)$, where $\delta$ is a partial function such that either $\delta(q, a) = p$ for some $p \in Q$ or $\delta(q, a)$ is undefined. Every DFA is also an IDFA.

Two states of an NFA are equivalent if their right languages are identical. An IDFA is minimal if no two of its states are equivalent.

We use the following operations on automata: 

1. The determinization operation $\mathbb{D}$ applied to an NFA $\mathcal{N}$ yields a DFA $\mathcal{N}^{\mathbb{D}}$ obtained by the well-known subset construction, where only subsets (including $\emptyset$) reachable from the initial subset of $\mathcal{N}^{\mathbb{D}}$ are used.

2. The trimming operation $\mathbb{T}$ applied to an NFA $\mathcal{N}$ accepting a non-empty language deletes from $\mathcal{N}$ every state $q$ not reachable from any initial state ($q \notin \delta(I, w)$ for any $w \in \Sigma^*$) and every state $q$ that does not lead to any final state ($\delta(q, w) \cap F = \emptyset$ for all $w \in \Sigma^*$), along with the incident transitions. An NFA that has no such states is said to be trim. Note that, if $\mathcal{N}$ is trim, then so is $\mathcal{N}^{\mathbb{R}}$. If the trimming operation is applied to a DFA $\mathcal{D}$, we obtain the IDFA $\mathcal{D}^{\mathbb{T}}$, which behaves like $\mathcal{D}$, except that it does not have any empty states.

3. The minimization operation $\mathbb{M}$ applied to an IDFA (DFA) $\mathcal{D}$ yields the minimal IDFA (DFA) $\mathcal{D}^{\mathbb{M}}$ equivalent to $\mathcal{D}$.

4. The reversal operation $\mathbb{R}$ applied to an NFA $\mathcal{N}$ yields an NFA $\mathcal{N}^{\mathbb{R}}$, where initial and final states of $\mathcal{N}$ have been interchanged in $\mathcal{N}^{\mathbb{R}}$ and all the transitions between states have been reversed. Note that $\mathcal{N}^{\mathbb{RT}} = \mathcal{N}^{\mathbb{TR}}$ for any NFA $\mathcal{N}$.

A trim IDFA $\mathcal{I}$ is bideterministic if also $\mathcal{I}^{\mathbb{R}}$ is an IDFA. A language is bideterministic if its minimal IDFA is bideterministic.
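
The determinization, trimming, and reversal operations can be sketched as follows. This is only an illustration under assumed conventions: an automaton is a hypothetical 5-tuple `(states, alphabet, delta, initial, final)` with `delta` mapping (state, letter) pairs to sets of states, so that a DFA is stored as an NFA with singleton targets. The sample automaton `N` encodes the NFA of Fig. 1 (a) as described by the NSE of Example 2.

```python
from collections import deque

def reverse(nfa):
    """Swap initial and final states and reverse every transition."""
    states, alphabet, delta, initial, final = nfa
    rdelta = {}
    for (q, a), targets in delta.items():
        for p in targets:
            rdelta.setdefault((p, a), set()).add(q)
    return (states, alphabet, rdelta, set(final), set(initial))

def determinize(nfa):
    """Subset construction restricted to reachable subsets (including the empty one)."""
    _, alphabet, delta, initial, final = nfa
    start = frozenset(initial)
    seen, queue, ddelta = {start}, deque([start]), {}
    while queue:
        s = queue.popleft()
        for a in alphabet:
            t = frozenset(p for q in s for p in delta.get((q, a), ()))
            ddelta[(s, a)] = {t}
            if t not in seen:
                seen.add(t)
                queue.append(t)
    dfinal = {s for s in seen if s & set(final)}
    return (seen, alphabet, ddelta, {start}, dfinal)

def trim(nfa):
    """Delete states that are unreachable or that lead to no final state."""
    states, alphabet, delta, initial, final = nfa

    def closure(sources, d):
        found, stack = set(sources), list(sources)
        while stack:
            q = stack.pop()
            for a in alphabet:
                for p in d.get((q, a), ()):
                    if p not in found:
                        found.add(p)
                        stack.append(p)
        return found

    useful = closure(initial, delta) & closure(final, reverse(nfa)[2])
    tdelta = {}
    for (q, a), targets in delta.items():
        kept = {p for p in targets if p in useful}
        if q in useful and kept:
            tdelta[(q, a)] = kept
    return (useful, alphabet, tdelta, set(initial) & useful, set(final) & useful)

# Hypothetical encoding of the NFA of Fig. 1 (a), taken from Example 2.
N = ({1, 2, 3}, {"a", "b"},
     {(1, "b"): {2}, (2, "a"): {1}, (2, "b"): {2, 3}, (3, "a"): {1}, (3, "b"): {3}},
     {1, 3}, {2, 3})
```

For this `N`, `determinize(N)` has the five subsets {1,3}, {1}, {2,3}, {2}, and the empty set as states, and trimming the result removes only the empty subset.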

Example 1. Figure 1 (a) shows an NFA $\mathcal{N}$. Its determinized DFA $\mathcal{N}^{\mathbb{D}}$ is in Fig. 1 (b), where parentheses around sets are omitted. The minimal equivalent $\mathcal{D} = \mathcal{N}^{\mathbb{DM}}$ of $\mathcal{N}^{\mathbb{D}}$ is in Fig. 1 (c), where the equivalent states {2}, {1,3}, and {2,3} are represented by {1,3}. The reversed and trimmed version $\mathcal{D}^{\mathbb{RT}}$ of the DFA $\mathcal{D}$ of Fig. 1 (c) is in Fig. 1 (d).

The left quotient, or simply quotient, of a language $L$ by a word $w$ is the language $w^{-1}L = \{x \in \Sigma^* \mid wx \in L\}$. Left quotients are also known as right residuals. Dually, the right quotient of a language $L$ by a word $w$ is the language $Lw^{-1} = \{x \in \Sigma^* \mid xw \in L\}$. Evidently, if $\mathcal{N}$ is an NFA and $x$ is in $L_{I,q}(\mathcal{N})$, then $L_{q,F}(\mathcal{N}) \subseteq x^{-1}(L(\mathcal{N}))$.

The quotient DFA of a regular language $L$ is $\mathcal{D} = (Q, \Sigma, \delta, q_0, F)$, where $Q = \{w^{-1}L \mid w \in \Sigma^*\}$, $\delta(w^{-1}L, a) = a^{-1}(w^{-1}L)$, $q_0 = \varepsilon^{-1}L = L$, and $F = \{w^{-1}L \mid \varepsilon \in w^{-1}L\}$. The quotient IDFA of a regular language $L$ is $\mathcal{D}^{\mathbb{T}}$.

The following definitions are from [8]: If $L \subseteq \Sigma^*$, a subfactorization of $L$ is a pair $(X, Y)$ of languages over $\Sigma$ such that $XY \subseteq L$. A factorization of $L$ is a subfactorization $(X, Y)$ such that, if $X \subseteq X'$, $Y \subseteq Y'$, and $X'Y' \subseteq L$ for any pair $(X', Y')$, then $X = X'$ and $Y = Y'$. The universal automaton of $L$ is $\mathcal{U}_L = (Q, \Sigma, \delta, I, F)$ where $Q$ is the set of all factorizations of $L$, $I = \{(X, Y) \in Q \mid \varepsilon \in X\}$, $F = \{(X, Y) \in Q \mid X \subseteq L\}$, and $(X', Y') \in \delta((X, Y), a)$ if and only if $Xa \subseteq X'$.

Fig. 1. (a) An NFA $\mathcal{N}$; (b) $\mathcal{N}^{\mathbb{D}}$; (c) $\mathcal{N}^{\mathbb{DM}}$; (d) $\mathcal{N}^{\mathbb{DMRT}}$

For any language $L$ let $L^{\varepsilon} = \emptyset$ if $\varepsilon \notin L$, and $L^{\varepsilon} = \{\varepsilon\}$ otherwise. Also, let $n \ge 1$ and let $[n] = \{1, \ldots, n\}$. A nondeterministic system of equations (NSE) with $n$ variables $L_1, \ldots, L_n$ is a set of language equations

$$L_i = \bigcup_{a \in \Sigma} a \Big( \bigcup_{j \in J_{i,a}} L_j \Big) \cup L_i^{\varepsilon}, \qquad i = 1, \ldots, n, \tag{1}$$

where $J_{i,a} \subseteq [n]$, together with a set $\{L_i \mid i \in I\}$, where $I \subseteq [n]$. The equations are assumed to have been simplified by the rules $a\emptyset = \emptyset$ and $K \cup \emptyset = \emptyset \cup K = K$, for any language $K$. Let $L_{i,a} = \bigcup_{j \in J_{i,a}} L_j$; then $L_{i,a} = a^{-1}L_i$ is the left quotient of $L_i$ by $a$. The language defined by an NSE is $L = \bigcup_{i \in I} L_i$.

Each NSE defines a unique NFA $\mathcal{N}$ and vice versa. States of $\mathcal{N}$ correspond to the variables $L_i$, there is a transition $L_i \xrightarrow{a} L_j$ in $\mathcal{N}$ if and only if $j \in J_{i,a}$, the set of initial states of $\mathcal{N}$ is $\{L_i \mid i \in I\}$, and the set of final states is $\{L_i \mid L_i^{\varepsilon} = \{\varepsilon\}\}$.

If each $L_i$ is a left quotient (that is, a right residual) of the language $L = \bigcup_{i \in I} L_i$, then the NSE and the corresponding NFA are called residual [6].

A deterministic system of equations (DSE) with $n$ variables is a set of language equations (called derivative equations when arising from regular expressions [3])

$$L_i = \bigcup_{a \in \Sigma} a L_{i,a} \cup L_i^{\varepsilon}, \qquad i = 1, \ldots, n, \tag{2}$$

where $L_{i,a}$ is one of the $n$ variables and the initial set is $\{L_1\}$. In DSE's we retain the empty language if it appears.

Each DSE defines a unique DFA $\mathcal{D}$ and vice versa. Each state of $\mathcal{D}$ corresponds to a variable $L_i$, there is a transition $L_i \xrightarrow{a} L_j$ in $\mathcal{D}$ if and only if $L_{i,a} = L_j$, the initial state of $\mathcal{D}$ corresponds to $L_1$, and the set of final states is $\{L_i \mid L_i^{\varepsilon} = \{\varepsilon\}\}$. In the special case when $\mathcal{D}$ is minimal, its DSE constitutes its quotient equations, where every $L_i$ is a quotient of the initial language $L_1$.

To simplify the notation, we write $\varepsilon$ instead of $\{\varepsilon\}$ in equations.
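Example 2. For the NFA of Fig. 1 (a), we have the NSE

$$\begin{aligned}
L_1 &= bL_2,\\
L_2 &= aL_1 \cup b(L_2 \cup L_3) \cup \varepsilon,\\
L_3 &= aL_1 \cup bL_3 \cup \varepsilon,
\end{aligned}$$

with the initial set $\{L_1, L_3\}$. The DSE for the DFA of Fig. 1 (b) obtained from this NSE is shown below on the left. Renaming the variables to correspond more closely to the subset construction, we get the equations on the right.

$$\begin{array}{ll}
L_1 \cup L_3 = aL_1 \cup b(L_2 \cup L_3) \cup \varepsilon, & L_{\{1,3\}} = aL_{\{1\}} \cup bL_{\{2,3\}} \cup \varepsilon,\\
L_1 = a\emptyset \cup bL_2, & L_{\{1\}} = aL_{\emptyset} \cup bL_{\{2\}},\\
L_2 \cup L_3 = aL_1 \cup b(L_2 \cup L_3) \cup \varepsilon, & L_{\{2,3\}} = aL_{\{1\}} \cup bL_{\{2,3\}} \cup \varepsilon,\\
L_2 = aL_1 \cup b(L_2 \cup L_3) \cup \varepsilon, & L_{\{2\}} = aL_{\{1\}} \cup bL_{\{2,3\}} \cup \varepsilon,\\
\emptyset = a\emptyset \cup b\emptyset. & L_{\emptyset} = aL_{\emptyset} \cup bL_{\emptyset}.
\end{array}$$

Noting that $L_{\{1,3\}}$, $L_{\{2,3\}}$, and $L_{\{2\}}$ are equivalent, we get the quotient equations for the DFA of Fig. 1 (c), where $L_{\{1\}} = a^{-1}L_{\{1,3\}}$, $L_{\{1,3\}} = b^{-1}L_{\{1,3\}}$, etc.

$$\begin{aligned}
L_{\{1,3\}} &= aL_{\{1\}} \cup bL_{\{1,3\}} \cup \varepsilon,\\
L_{\{1\}} &= aL_{\emptyset} \cup bL_{\{1,3\}},\\
L_{\emptyset} &= aL_{\emptyset} \cup bL_{\emptyset}.
\end{aligned}$$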



3 The Atomaton of a Regular Language 

From now on we consider only non-empty regular languages. Let $L$ be a regular language, and let $L_1 = L, L_2, \ldots, L_n$ be its quotients. An atom of $L$ is any non-empty language of the form $A = \widetilde{L_1} \cap \widetilde{L_2} \cap \cdots \cap \widetilde{L_n}$, where $\widetilde{L_i}$ is either $L_i$ or $\overline{L_i}$, and at least one of the $L_i$ is not complemented (in other words, $\overline{L_1} \cap \overline{L_2} \cap \cdots \cap \overline{L_n}$ is not an atom). A language has at most $2^n - 1$ atoms.

An atom is initial if it has $L_1$ (rather than $\overline{L_1}$) as a term; it is final if and only if it contains $\varepsilon$. Since $L$ is non-empty, it has at least one quotient containing $\varepsilon$. Hence it has exactly one final atom, the atom $\widehat{L_1} \cap \widehat{L_2} \cap \cdots \cap \widehat{L_n}$, where $\widehat{L_i} = L_i$ if $\varepsilon \in L_i$, and $\widehat{L_i} = \overline{L_i}$ otherwise. The atoms of $L$ will be denoted by $A_1, \ldots, A_m$. Furthermore, we define $I$ to be the set of initial atoms, and let $A_m$ be the final atom by convention.

Proposition 1. The following properties hold for atoms:

1. Atoms are pairwise disjoint, that is, $A_i \cap A_j = \emptyset$ for all $i, j \in [m]$, $i \ne j$.

2. The quotient $w^{-1}L$ of $L$ by $w \in \Sigma^*$ is a (possibly empty) union of atoms.

3. The quotient $w^{-1}A_i$ of $A_i$ by $w \in \Sigma^*$ is a (possibly empty) union of atoms.

Proof. 1. If $A_i \ne A_j$, then there exists $h \in [n]$ such that $L_h$ is a term of $A_i$ and $\overline{L_h}$ is a term of $A_j$ or vice versa. Hence $A_i \cap A_j = \emptyset$.

2. The empty quotient, if present, is the empty union of atoms. If $L_i$ is a quotient of $L$ and $L_i \ne \emptyset$, then $L_i$ is the union of all the $2^{n-1}$ intersections that have $L_i$ as a term. This includes all the atoms that have $L_i$ as a term, and possibly some empty intersections.

3. Consider the quotient equations of $L$. The quotient of each atom $A_i$ by a letter $a \in \Sigma$ is an intersection $X$ of complemented and uncomplemented quotients of $L$. If a quotient $L_j$ of $L$ does not appear as a term in $X$, then we "add it in" by using the fact that $X = X \cap (L_j \cup \overline{L_j}) = (X \cap L_j) \cup (X \cap \overline{L_j})$. After all the missing quotients are so added, we obtain a union of atoms. Note that the intersection having all quotients complemented does not appear in this construction. It follows that $w^{-1}A_i$ is a union of atoms of $L$ for every $w \in \Sigma^*$. □

Lemma 1. Let $w, x \in \Sigma^*$. If $wx \in A_i$ and $x \in A_j$, then $wA_j \subseteq A_i$, for $i, j \in [m]$.

Proof. Assume that $wx \in A_i$ and $x \in A_j$, but suppose $wy \notin A_i$ for some $y \in A_j$. Then $x \in w^{-1}A_i$ and $y \notin w^{-1}A_i$. By Proposition 1, Part 3, $w^{-1}A_i$ is a union of atoms. So, on the one hand, $x \in w^{-1}A_i$ and $x \in A_j$ together imply that $A_j \subseteq w^{-1}A_i$. On the other hand, from $y \notin w^{-1}A_i$ and $y \in A_j$, we get $A_j \not\subseteq w^{-1}A_i$. Thus, the supposition $wy \notin A_i$ leads to a contradiction. Hence, $wA_j \subseteq A_i$. □

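In the following definition we use a 1-1 correspondence $A_i \leftrightarrow \mathbf{A}_i$ between atoms $A_i$ of a language $L$ and the states $\mathbf{A}_i$ of the NFA $\mathcal{A}$ defined below.

Definition 1. Let $L = L_1 \subseteq \Sigma^*$ be any regular language with the set of atoms $Q = \{A_1, \ldots, A_m\}$, initial set of atoms $I \subseteq Q$, and final atom $A_m$. The átomaton of $L$ is the NFA $\mathcal{A} = (\mathbf{Q}, \Sigma, \delta, \mathbf{I}, \{\mathbf{A}_m\})$, where $\mathbf{Q} = \{\mathbf{A}_i \mid A_i \in Q\}$, $\mathbf{I} = \{\mathbf{A}_i \mid A_i \in I\}$, and $\mathbf{A}_j \in \delta(\mathbf{A}_i, a)$ if and only if $aA_j \subseteq A_i$, for all $A_i, A_j \in Q$.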

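Example 3. Let $L$ be defined by the quotient equations below on the left and accepted by the quotient DFA of Fig. 2 (a). We find the atoms using the quotient equations. To simplify the notation, we denote $L_i \cap L_j$ by $L_{ij}$, $L_i \cap \overline{L_j}$ by $L_{i\bar{j}}$, etc. Noting that $L_{1\bar{2}3}$ is empty, we have the equations on the right, from which we get Fig. 2 (b) for the átomaton of $L$.

$$\begin{array}{ll}
L_1 = aL_2 \cup bL_1, & L_{123} = a(L_{123} \cup L_{\bar{1}23}) \cup b(L_{123} \cup L_{12\bar{3}}),\\
L_2 = aL_3 \cup bL_1 \cup \varepsilon, & L_{\bar{1}23} = aL_{\bar{1}\bar{2}3},\\
L_3 = aL_3 \cup bL_2. & L_{12\bar{3}} = bL_{1\bar{2}\bar{3}},\\
 & L_{\bar{1}\bar{2}3} = b(L_{\bar{1}23} \cup L_{\bar{1}2\bar{3}}),\\
 & L_{1\bar{2}\bar{3}} = a(L_{12\bar{3}} \cup L_{\bar{1}2\bar{3}}),\\
 & L_{\bar{1}2\bar{3}} = \varepsilon.
\end{array}$$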
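
The atoms of Example 3 can also be found mechanically. The sketch below rests on an observation that follows from the definitions but is not stated in this form in the text: the atom containing a word $w$ is determined by the set $S_w$ of quotients that contain $w$, where $S_\varepsilon$ is the set of final states of the quotient DFA and $S_{aw} = \{i \mid \delta(i, a) \in S_w\}$. By Lemma 1, the átomaton then has a transition from the atom with signature $S$ to the atom with signature $T$ on $a$ exactly when $S$ is obtained from $T$ in this way. All function and variable names are hypothetical.

```python
def atoms_and_atomaton(states, alphabet, delta, final):
    """Return (atoms, transitions, final_atom) for a complete quotient DFA.
    An atom is represented by the frozenset of indices of the
    uncomplemented quotients in its defining intersection."""
    def pre(a, subset):
        # quotients L_i whose a-successor lies in `subset`
        return frozenset(q for q in states if delta[(q, a)] in subset)

    start = frozenset(final)            # signature of the empty word
    atoms, stack = {start}, [start]
    while stack:
        t = stack.pop()
        for a in alphabet:
            s = pre(a, t)
            if s and s not in atoms:    # empty signature: not an atom
                atoms.add(s)
                stack.append(s)

    transitions = {}                    # (S, a) -> set of T with a A_T inside A_S
    for t in atoms:
        for a in alphabet:
            s = pre(a, t)
            if s:
                transitions.setdefault((s, a), set()).add(t)
    return atoms, transitions, start

# Quotient DFA of Example 3: L1 = aL2 U bL1, L2 = aL3 U bL1 U e, L3 = aL3 U bL2.
EX3_DELTA = {(1, "a"): 2, (1, "b"): 1, (2, "a"): 3,
             (2, "b"): 1, (3, "a"): 3, (3, "b"): 2}
ATOMS, TRANS, FINAL_ATOM = atoms_and_atomaton({1, 2, 3}, ("a", "b"), EX3_DELTA, {2})
INITIAL_ATOMS = {s for s in ATOMS if 1 in s}
```

Under these assumptions the computation yields six atoms, the signature {1, 3} (that is, $L_{1\bar{2}3}$) does not occur, and the transitions agree with the atom equations of Example 3.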

Fig. 2. (a) Quotient automaton; (b) átomaton of $L$.

Lemma 2. For $w \in \Sigma^*$, $\mathbf{A}_j \in \delta(\mathbf{A}_i, w)$ if and only if $wA_j \subseteq A_i$, for $i, j \in [m]$.

Proof. The proof is by induction on the length of $w$. If $|w| = 0$ and $\mathbf{A}_j \in \delta(\mathbf{A}_i, \varepsilon)$, then $i = j$ and $\varepsilon A_j \subseteq A_i$. If $|w| = 0$ and $\varepsilon A_j \subseteq A_i$, then $i = j$, since atoms are disjoint; hence $\mathbf{A}_j \in \delta(\mathbf{A}_i, \varepsilon)$. If $|w| = 1$, then the lemma holds by Definition 1.

Now, let $w = av$, where $a \in \Sigma$ and $v \in \Sigma^+$, and assume that the lemma holds for $v$. Suppose that $\mathbf{A}_j \in \delta(\mathbf{A}_i, av)$. Then there exists some state $\mathbf{A}_k$ such that $\mathbf{A}_k \in \delta(\mathbf{A}_i, a)$ and $\mathbf{A}_j \in \delta(\mathbf{A}_k, v)$. Thus, $aA_k \subseteq A_i$ by the definition of átomaton, and $vA_j \subseteq A_k$ by the induction assumption, implying that $avA_j \subseteq A_i$.

Conversely, let $avA_j \subseteq A_i$. Then $vA_j \subseteq a^{-1}A_i$. Let $x \in A_j$. Then $vx \in a^{-1}A_i$. Since by Proposition 1, Part 3, $a^{-1}A_i$ is a union of atoms, there exists some atom $A_k$ such that $vx \in A_k$. Since $x \in A_j$, by Lemma 1 we get $vA_j \subseteq A_k$. Furthermore, because $avA_j \subseteq A_i$ and $x \in A_j$, we have $avx \in A_i$. Since $vx \in A_k$, then $aA_k \subseteq A_i$ by Lemma 1.

As the lemma holds for $v$ and $a$, $vA_j \subseteq A_k$ implies $\mathbf{A}_j \in \delta(\mathbf{A}_k, v)$, and $aA_k \subseteq A_i$ implies $\mathbf{A}_k \in \delta(\mathbf{A}_i, a)$, showing that $\mathbf{A}_j \in \delta(\mathbf{A}_i, av)$. □

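Proposition 2. The right language of state $\mathbf{A}_i$ of átomaton $\mathcal{A}$ is the atom $A_i$, that is, $L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A}) = A_i$, for all $i \in [m]$.

Proof. Let $w \in L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A})$; then $\mathbf{A}_m \in \delta(\mathbf{A}_i, w)$. By Lemma 2, we have $wA_m \subseteq A_i$. Since $\varepsilon \in A_m$, we have $w \in A_i$.

Now suppose that $w \in A_i$. Then $w\varepsilon \in A_i$, and since $\varepsilon \in A_m$, by Lemma 1 we get $wA_m \subseteq A_i$. By Lemma 2, $\mathbf{A}_m \in \delta(\mathbf{A}_i, w)$, that is, $w \in L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A})$. □

Theorem 1. The language accepted by the átomaton $\mathcal{A}$ of $L$ is $L$, that is, $L(\mathcal{A}) = L_{\mathbf{I},\{\mathbf{A}_m\}}(\mathcal{A}) = L$.

Proof. We have $L(\mathcal{A}) = \bigcup_{\mathbf{A}_i \in \mathbf{I}} L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A}) = \bigcup_{A_i \in I} A_i$, by Proposition 2. Since $I$ is the set of all atoms that have $L = L_1$ as a term, we also have $L = \bigcup_{A_i \in I} A_i$. □
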
Proposition 3. The left language of state $\mathbf{A}_i$ of átomaton $\mathcal{A}$ of a language $L$ is $L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A}) = ((x^R)^{-1}L^R)^R$, for every $i \in [m]$ and every word $x$ in $A_i$.

Proof. If $w \in L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A})$, then $\mathbf{A}_i \in \delta(\mathbf{A}_j, w)$ for some $A_j \in I$. Then $wA_i \subseteq A_j$ by Lemma 2. Since $A_j \subseteq L$, we also have $wA_i \subseteq L$, that is, $wx \in L$ for every $x \in A_i$. Then $x^R w^R \in L^R$ and $w^R \in (x^R)^{-1}L^R$. Thus, $w \in ((x^R)^{-1}L^R)^R$.

Now, let $w \in ((x^R)^{-1}L^R)^R$, where $x \in A_i$. Then $w^R \in (x^R)^{-1}L^R$, implying that $wx \in L$. By Theorem 1, there is some $A_j \in I$ such that $wx \in L_{\mathbf{A}_j,\{\mathbf{A}_m\}}(\mathcal{A})$. By Proposition 2, $wx \in A_j$. Since $x \in A_i$, by Lemma 1 we have $wA_i \subseteq A_j$. By Lemma 2, $\mathbf{A}_i \in \delta(\mathbf{A}_j, w)$, implying that $w \in L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A})$. □

Proposition 4. The left language of state $\mathbf{A}_i$ of átomaton $\mathcal{A}$ is non-empty, that is, $L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A}) \ne \emptyset$, for every $i \in [m]$.

Proof. Suppose that $L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A}) = \emptyset$ for some $i \in [m]$. Then by Proposition 3, $((x^R)^{-1}L^R)^R = \emptyset$ for any $x \in A_i$. Then also $(x^R)^{-1}L^R = \emptyset$, implying that for any $w \in \Sigma^*$, $wx \notin L$. However, since there is some quotient $L_j$ of $L$, $j \in [n]$, such that $A_i \subseteq L_j$, and there is an $x$ in $A_i$, we have $x \in L_j$. Let $v \in \Sigma^*$ be such that $L_j = v^{-1}L$. Then we get $vx \in L$, which is a contradiction. □

Corollary 1. The átomaton of any regular language is trim.

Next we recall a theorem from [2], slightly modified to apply to IDFA's:

Theorem 2. If $\mathcal{I}$ is an IDFA in which every state is reachable from the initial state and $\mathcal{I}$ accepts $L$, then $\mathcal{I}^{\mathbb{RD}}$ is the minimal DFA of $L^R$, that is, $\mathcal{I}^{\mathbb{RDM}} = \mathcal{I}^{\mathbb{RD}}$.

We have defined a unique NFA, the átomaton, directly from the quotient equations of a language $L$, that is, from the minimal DFA recognizing $L$. In contrast to this, Sengoku [11] defined a unique NFA starting from any NFA $\mathcal{N}$ accepting $L$: The normal automaton of $L$ is the NFA $\mathcal{N}^{\mathbb{RDMTR}}$. Matz and Potthoff [9] (p. 78) defined an NFA $\mathcal{E}$ as the reverse of the trim minimal DFA accepting $L^R$, that is, $\mathcal{E} = \mathcal{B}^{\mathbb{TR}}$, where $\mathcal{B}$ is the minimal DFA accepting $L^R$. We now relate a number of concepts associated with regular languages:

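Theorem 3. Let $L$ be any regular language, and let $\mathcal{A}$ be its átomaton.

1. The reverse $\mathcal{A}^{\mathbb{R}}$ of $\mathcal{A}$ is an IDFA.

2. $\mathcal{A}^{\mathbb{R}}$ is minimal.

3. The determinization $\mathcal{A}^{\mathbb{D}}$ of $\mathcal{A}$ is the minimal DFA of $L$.

4. The normal NFA $\mathcal{N}^{\mathbb{RDMTR}}$ of any NFA $\mathcal{N}$ accepting $L$ is isomorphic to $\mathcal{A}$.

5. Matz and Potthoff's NFA $\mathcal{E}$ is isomorphic to $\mathcal{A}$.

6. $\mathcal{A}$ is isomorphic to the quotient IDFA of $L$ if and only if $L$ is bideterministic.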

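Proof. Suppose $L$ has the quotients $L_1, \ldots, L_n$ and atoms $A_1, \ldots, A_m$.

1. Since $\mathcal{A}$ has one accepting state, $\mathcal{A}^{\mathbb{R}}$ has one initial state. Because atoms are disjoint, a word $w$ can belong to at most one atom. If $w$ belongs to $A_i$, then, by Proposition 2, $w \in L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A})$ and $w \notin L_{\mathbf{A}_j,\{\mathbf{A}_m\}}(\mathcal{A})$ if $j \ne i$. Hence $w^R \in L_{\{\mathbf{A}_m\},\mathbf{A}_i}(\mathcal{A}^{\mathbb{R}})$ and $w^R \notin L_{\{\mathbf{A}_m\},\mathbf{A}_j}(\mathcal{A}^{\mathbb{R}})$ if $j \ne i$. Thus $\mathcal{A}^{\mathbb{R}}$ is an IDFA.

2. Since $\mathcal{A}$ is trim, so is $\mathcal{A}^{\mathbb{R}}$. Thus, if $\mathcal{A}^{\mathbb{R}}$ is not minimal, there must be states $\mathbf{A}_i, \mathbf{A}_j \in \mathbf{Q}$, $\mathbf{A}_i \ne \mathbf{A}_j$, such that $L_{\mathbf{A}_i,\mathbf{I}}(\mathcal{A}^{\mathbb{R}}) = L_{\mathbf{A}_j,\mathbf{I}}(\mathcal{A}^{\mathbb{R}})$. Let $L_k = u^{-1}L$ be any non-empty quotient of $L$, where $k \in [n]$ and $u \in \Sigma^*$. Then there are two possibilities: either $u \in L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A})$, or $u \notin L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A})$.

In the first case $u^R \in L_{\mathbf{A}_i,\mathbf{I}}(\mathcal{A}^{\mathbb{R}})$, and, since $L_{\mathbf{A}_i,\mathbf{I}}(\mathcal{A}^{\mathbb{R}}) = L_{\mathbf{A}_j,\mathbf{I}}(\mathcal{A}^{\mathbb{R}})$, we have $u^R \in L_{\mathbf{A}_j,\mathbf{I}}(\mathcal{A}^{\mathbb{R}})$, implying that $u \in L_{\mathbf{I},\mathbf{A}_j}(\mathcal{A})$. Thus, $L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A}) \subseteq L_k$ and $L_{\mathbf{A}_j,\{\mathbf{A}_m\}}(\mathcal{A}) \subseteq L_k$. In view of Proposition 2, $A_i$ and $A_j$ are both subsets of $L_k$.

Now, assume that $u \notin L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A})$. Then, as in the first case, we get $u \notin L_{\mathbf{I},\mathbf{A}_j}(\mathcal{A})$. If $L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A}) \subseteq L_k$, then $ux \in L$ for some $x \in L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A})$. But then $x^R u^R \in L^R$. Since $x^R \in L_{\{\mathbf{A}_m\},\mathbf{A}_i}(\mathcal{A}^{\mathbb{R}})$ and $\mathcal{A}^{\mathbb{R}}$ is deterministic, we must have $u^R \in L_{\mathbf{A}_i,\mathbf{I}}(\mathcal{A}^{\mathbb{R}})$. This contradicts the assumption that $u \notin L_{\mathbf{I},\mathbf{A}_i}(\mathcal{A})$. Thus $L_{\mathbf{A}_i,\{\mathbf{A}_m\}}(\mathcal{A}) \not\subseteq L_k$, and similarly, $L_{\mathbf{A}_j,\{\mathbf{A}_m\}}(\mathcal{A}) \not\subseteq L_k$, implying that neither $A_i$ nor $A_j$ is a subset of $L_k$.

So, for every $k \in [n]$, either both atoms $A_i$ and $A_j$ are subsets of $L_k$ or neither of them is. Since $A_i$ and $A_j$ are distinct, there must be an $h$ such that $A_i \subseteq L_h$ and $A_j \subseteq \overline{L_h}$. This contradicts our earlier conclusion.

3. By Theorem 2 applied to IDFA $\mathcal{A}^{\mathbb{R}}$, the automaton $\mathcal{A}^{\mathbb{RRD}} = \mathcal{A}^{\mathbb{D}}$ is minimal.

4. Let $\mathcal{N}$ be any NFA accepting $L$. The DFA $\mathcal{N}^{\mathbb{RDM}}$ is the unique minimal DFA accepting the language $L^R$. By Parts 1 and 2, $\mathcal{A}^{\mathbb{R}}$ is a minimal IDFA, and it accepts $L^R$. Since $\mathcal{N}^{\mathbb{RDMT}}$ is isomorphic to $\mathcal{A}^{\mathbb{R}}$, it follows that the normal automaton $\mathcal{N}^{\mathbb{RDMTR}}$ is isomorphic to $\mathcal{A}$.

5. Since $\mathcal{B}$ is isomorphic to $\mathcal{N}^{\mathbb{RDM}}$ of Part 4, the claim follows.

6. Let $\mathcal{D}$ be the quotient DFA of $L$, and suppose that $\mathcal{A}$ is isomorphic to $\mathcal{D}^{\mathbb{T}}$. By Part 1, $\mathcal{A}^{\mathbb{R}}$ is an IDFA. Since $\mathcal{A}$ is isomorphic to $\mathcal{D}^{\mathbb{T}}$, $\mathcal{A}$ itself is an IDFA. Hence $\mathcal{A}$, and so also $L$, are bideterministic.

Conversely, let $\mathcal{B}$ be a trim bideterministic IDFA accepting $L$. Then $\mathcal{B}^{\mathbb{R}}$ is an IDFA satisfying the condition of Theorem 2. Thus $(\mathcal{B}^{\mathbb{R}})^{\mathbb{RDM}} = (\mathcal{B}^{\mathbb{R}})^{\mathbb{RD}} = \mathcal{B}^{\mathbb{D}}$. Since $\mathcal{B}$ is an IDFA, we have $\mathcal{B}^{\mathbb{DT}} = \mathcal{B}$; hence $\mathcal{B}$ is isomorphic to the quotient IDFA of $L$.

Since $\mathcal{B}$ is an IDFA satisfying the condition of Theorem 2, $\mathcal{B}^{\mathbb{RD}}$ is the minimal IDFA of $L^R$, that is, $\mathcal{B}^{\mathbb{RDM}} = \mathcal{B}^{\mathbb{RD}}$. Because $\mathcal{B}^{\mathbb{R}}$ is deterministic, $\mathcal{B}^{\mathbb{RDT}} = \mathcal{B}^{\mathbb{R}}$. Thus $\mathcal{B}^{\mathbb{RDMT}} = \mathcal{B}^{\mathbb{RDT}} = \mathcal{B}^{\mathbb{R}}$. By Part 4, $\mathcal{B}^{\mathbb{RDMTR}} = \mathcal{B}$ is the átomaton of $L$. Hence $\mathcal{B}$ is both the minimal IDFA of $L$ and its átomaton. □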

As noted in [9], for each word $w$ in $L$ there is a unique path in $\mathcal{A}$ accepting $w$, and deleting any transition from $\mathcal{A}$ results in a smaller accepted language. It is also stated in [9] without proof that the right language $L_{q,F}(\mathcal{N})$ of any state $q$ of an NFA $\mathcal{N}$ accepting $L$ is a subset of a union of atoms. This holds because $L_{q,F}(\mathcal{N})$ is a subset of a (left) quotient of $L$, and quotients are unions of atoms by Proposition 1, Part 2.

Theorem 3 provides another method of finding the átomaton of $L$: simply trim the quotient DFA of $L^R$ and reverse it. In view of this we have

Corollary 2. If $L = L^R$ and $L$ is accepted by quotient DFA $\mathcal{D}$, then $\mathcal{A} = \mathcal{D}^{\mathbb{TR}}$. In particular, if $L$ is a unary language, then $\mathcal{A} = \mathcal{D}^{\mathbb{TR}}$.

Example 4. Let $L = (b \cup ba)^*$; then $L^R = (b \cup ab)^*$, and it is accepted by the minimal DFA $\mathcal{D}$ of Fig. 1 (c). Its trimmed reverse is shown in Fig. 1 (d). Hence the NFA of Fig. 1 (d) is the átomaton of $L$.
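
The construction used in Example 4 (trim the quotient DFA of $L^R$, then reverse it) can be sketched as follows. The DFA below is read off the quotient equations for Fig. 1 (c) in Example 2; the state names "13", "1", and "0" (for $L_{\{1,3\}}$, $L_{\{1\}}$, and the empty quotient) and all function names are assumptions made for this illustration.

```python
import re
from itertools import product

# Quotient DFA of L^R = (b U ab)*, from the quotient equations of Example 2.
DFA_DELTA = {("13", "a"): "1", ("13", "b"): "13",
             ("1", "a"): "0", ("1", "b"): "13",
             ("0", "a"): "0", ("0", "b"): "0"}
DFA_INITIAL, DFA_FINAL = "13", {"13"}

def trimmed_reverse(delta, initial, final):
    """Trim a DFA, then reverse it; returns (rdelta, rinitial, rfinal)."""
    def closure(sources, edges):
        found, stack = set(sources), list(sources)
        while stack:
            q = stack.pop()
            for p in edges.get(q, ()):
                if p not in found:
                    found.add(p)
                    stack.append(p)
        return found

    fwd, bwd = {}, {}
    for (q, a), p in delta.items():
        fwd.setdefault(q, set()).add(p)
        bwd.setdefault(p, set()).add(q)
    useful = closure({initial}, fwd) & closure(final, bwd)

    rdelta = {}
    for (q, a), p in delta.items():
        if q in useful and p in useful:
            rdelta.setdefault((p, a), set()).add(q)
    # final states become initial and the initial state becomes final
    return rdelta, set(final) & useful, {initial} & useful

def accepts(rdelta, rinitial, rfinal, word):
    current = set(rinitial)
    for a in word:
        current = {p for q in current for p in rdelta.get((q, a), ())}
    return bool(current & rfinal)

ATOMATON = trimmed_reverse(DFA_DELTA, DFA_INITIAL, DFA_FINAL)
```

The resulting NFA has two states, and a brute-force comparison against the regular expression `(b|ba)*` on all short words gives a quick sanity check that it accepts $L = (b \cup ba)^*$.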

4 Atomic Automata 

We now introduce a new class of NFA's and study their properties. 

Definition 2. An NFA $\mathcal{N} = (Q, \Sigma, \delta, I, F)$ is atomic if for every state $q \in Q$, the right language $L_{q,F}(\mathcal{N})$ of $q$ is a union of some atoms of $L(\mathcal{N})$.

Note that, if $L_{q,F}(\mathcal{N}) = \emptyset$, then it is the union of zero atoms.

Recall that an NFA $\mathcal{N}$ is residual, if $L_{q,F}(\mathcal{N})$ is a (left) quotient of $L(\mathcal{N})$ for every $q \in Q$. Since every quotient is a union of atoms (see Proposition 1, Part 2), every residual NFA is atomic. However, the converse is not true: there exist atomic NFA's which are not residual. For example, the átomaton of Fig. 2 is atomic, but not residual. Note also that every DFA is a special case of a residual NFA; hence every DFA is atomic.

Let us now consider the universal automaton $\mathcal{U}_L = (Q, \Sigma, \delta, I, F)$ of a language $L$. We recall some basic properties of this automaton from [8]. Let $(X, Y)$ be any factorization of $L$. Then

(1) $Y = \bigcap_{x \in X} x^{-1}L$ and $X = \bigcap_{y \in Y} Ly^{-1}$.

(2) $L_{I,(X,Y)}(\mathcal{U}_L) = X$ and $L_{(X,Y),F}(\mathcal{U}_L) = Y$.

(3) The universal automaton $\mathcal{U}_L$ accepts $L$.

To recapitulate what was said above about residual automata and DFAs, and also to show that the universal automaton is atomic, we have

Theorem 4. Let $L$ be any regular language. The following automata accepting $L$ are atomic: 1. The átomaton $\mathcal{A}$, 2. Any DFA, 3. Any residual NFA, 4. The universal automaton $\mathcal{U}_L$.

Proof. 1. The right language of every state of $\mathcal{A}$ is an atom of $L$, so $\mathcal{A}$ is atomic.

2. The right language of every state of any DFA accepting $L$ is a quotient of $L$. Since every quotient is a union of atoms, every DFA is atomic.

3. The right language of every state of any residual NFA of $L$ is a quotient of $L$, and hence a union of atoms. Thus, any residual NFA is atomic.

4. By (1) and (2) above, $L_{(X,Y),F}(\mathcal{U}_L) = \bigcap_{x \in X} x^{-1}L$. Let $L_1, \ldots, L_n$ be the quotients of $L$. Then $L_{(X,Y),F}(\mathcal{U}_L) = \bigcap_{i \in H} L_i$ for some $H \subseteq [n]$. Now

$$\bigcap_{i \in H} L_i = \Big( \bigcap_{i \in H} L_i \Big) \cap \Big( \bigcap_{j \in [n] \setminus H} (L_j \cup \overline{L_j}) \Big) = \bigcup \Big( \Big( \bigcap_{i \in H} L_i \Big) \cap \Big( \bigcap_{j \in [n] \setminus H} \widetilde{L_j} \Big) \Big),$$

where $\widetilde{L_j}$ is either $L_j$ or $\overline{L_j}$. Thus the right language of state $(X, Y)$ of $\mathcal{U}_L$ is a union of atoms of $L$. By (3) above, $L(\mathcal{U}_L) = L$; hence $\mathcal{U}_L$ is atomic. □

5 Extension of Brzozowski's Theorem on Minimal DFA's 

Theorem 2 forms the basis for Brzozowski's "double-reversal" minimization algorithm [2]: Given any DFA (or IDFA) $\mathcal{D}$, reverse it to get $\mathcal{D}^{\mathbb{R}}$, determinize $\mathcal{D}^{\mathbb{R}}$ to get $\mathcal{D}^{\mathbb{RD}}$, reverse $\mathcal{D}^{\mathbb{RD}}$ to get $\mathcal{D}^{\mathbb{RDR}}$, and then determinize $\mathcal{D}^{\mathbb{RDR}}$ to get $\mathcal{D}^{\mathbb{RDRD}}$. This last DFA is guaranteed to be minimal by Theorem 2, since $\mathcal{D}^{\mathbb{RD}}$ is deterministic. Hence $\mathcal{D}^{\mathbb{RDRD}}$ is the minimal DFA equivalent to $\mathcal{D}$.

Since this conceptually very simple algorithm carries out two determinizations, its complexity is exponential in the number of states of the original automaton in the worst case. However, its performance is good in practice, often better than Hopcroft's algorithm [12,13]. Furthermore, this algorithm applied to an NFA still yields an equivalent minimal DFA; see [13], for example.
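
The four steps of the double-reversal algorithm can be sketched directly. The code below is a minimal illustration, assuming a hypothetical 5-tuple representation `(states, alphabet, delta, initial, final)` in which `delta` maps (state, letter) pairs to sets of states; it is applied to the NFA of Fig. 1 (a), encoded from the NSE of Example 2.

```python
from collections import deque
from itertools import product

def reverse(nfa):
    """Swap initial and final states and reverse every transition."""
    states, alphabet, delta, initial, final = nfa
    rdelta = {}
    for (q, a), targets in delta.items():
        for p in targets:
            rdelta.setdefault((p, a), set()).add(q)
    return (states, alphabet, rdelta, set(final), set(initial))

def determinize(nfa):
    """Subset construction over the reachable subsets only."""
    _, alphabet, delta, initial, final = nfa
    start = frozenset(initial)
    seen, queue, ddelta = {start}, deque([start]), {}
    while queue:
        s = queue.popleft()
        for a in alphabet:
            t = frozenset(p for q in s for p in delta.get((q, a), ()))
            ddelta[(s, a)] = {t}
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return (seen, alphabet, ddelta, {start}, {s for s in seen if s & set(final)})

def double_reversal(nfa):
    """Reverse, determinize, reverse, determinize."""
    return determinize(reverse(determinize(reverse(nfa))))

def accepts(nfa, word):
    _, _, delta, initial, final = nfa
    current = set(initial)
    for a in word:
        current = {p for q in current for p in delta.get((q, a), ())}
    return bool(current & set(final))

# Hypothetical encoding of the NFA of Fig. 1 (a), taken from Example 2.
N = ({1, 2, 3}, {"a", "b"},
     {(1, "b"): {2}, (2, "a"): {1}, (2, "b"): {2, 3}, (3, "a"): {1}, (3, "b"): {3}},
     {1, 3}, {2, 3})
MINIMAL = double_reversal(N)
```

For this example the direct subset construction produces five states, whereas the double-reversal result has three, matching the three quotient equations obtained for Fig. 1 (c) in Example 2.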

We now generalize Theorem 2: 

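Theorem 5. For a trim NFA $\mathcal{N}$, $\mathcal{N}^{\mathbb{D}}$ is minimal if and only if $\mathcal{N}^{\mathbb{R}}$ is atomic.

Proof. Let $\mathcal{N} = (Q, \Sigma, \delta, I, F)$ be any trim NFA, and let $\mathcal{N}^{\mathbb{R}} = (Q, \Sigma, \delta^R, F, I)$ be its reverse. Let the atoms of $L(\mathcal{N}^{\mathbb{R}})$ be $B_1, \ldots, B_r$, and let $\mathcal{B}$ be the átomaton of $L(\mathcal{N}^{\mathbb{R}})$.

Assume first that $\mathcal{N}^{\mathbb{D}}$ is minimal. Let $q$ be a state of $\mathcal{N}$, and hence of $\mathcal{N}^{\mathbb{R}}$; since $\mathcal{N}$ is trim, so is $\mathcal{N}^{\mathbb{R}}$, and there are $w, x \in \Sigma^*$ such that $x \in L_{F,q}(\mathcal{N}^{\mathbb{R}})$ and $w \in L_{q,I}(\mathcal{N}^{\mathbb{R}})$. Since $L_{q,I}(\mathcal{N}^{\mathbb{R}}) \subseteq x^{-1}L(\mathcal{N}^{\mathbb{R}})$, and every quotient of $L(\mathcal{N}^{\mathbb{R}})$ is a union of atoms, there is some $i \in [r]$ such that $w \in B_i$.

Suppose that $\mathcal{N}^{\mathbb{R}}$ is not atomic; then there must be a state $q' \in Q$ whose right language is not a union of atoms. This means that there is some $i \in [r]$ and words $u, v \in B_i$ such that $u \in L_{q',I}(\mathcal{N}^{\mathbb{R}})$ but $v \notin L_{q',I}(\mathcal{N}^{\mathbb{R}})$. Suppose that $z$ is in $L_{F,q'}(\mathcal{N}^{\mathbb{R}})$. Since $u \in z^{-1}L(\mathcal{N}^{\mathbb{R}})$, $u \in B_i$, and $z^{-1}L(\mathcal{N}^{\mathbb{R}})$ is a union of atoms, we must have $B_i \subseteq z^{-1}L(\mathcal{N}^{\mathbb{R}})$. But now $v \in B_i$ implies that $zv \in L(\mathcal{N}^{\mathbb{R}})$. Hence there must be a state $q'' \in Q$, $q'' \ne q'$, such that $v \in L_{q'',I}(\mathcal{N}^{\mathbb{R}})$. Therefore, we know that $u^R \in L_{I,q'}(\mathcal{N})$, $v^R \notin L_{I,q'}(\mathcal{N})$, and $v^R \in L_{I,q''}(\mathcal{N})$.

Now, since every state of $\mathcal{N}^{\mathbb{D}}$ is a subset of the state set $Q$ of $\mathcal{N}$, there is a state $s'$ of $\mathcal{N}^{\mathbb{D}}$ such that $q' \in s'$ and $u^R \in L_{I,s'}(\mathcal{N}^{\mathbb{D}})$, and there is a state $s''$ of $\mathcal{N}^{\mathbb{D}}$ such that $q'' \in s''$ and $v^R \in L_{I,s''}(\mathcal{N}^{\mathbb{D}})$. Since $v^R \notin L_{I,q'}(\mathcal{N})$, we have $q' \notin s''$, implying that $s' \ne s''$.

By Theorem 3, Part 4, $(\mathcal{N}^{\mathbb{R}})^{\mathbb{RDMTR}} = \mathcal{N}^{\mathbb{DMTR}}$ is isomorphic to the átomaton $\mathcal{B}$. By the assumption that $\mathcal{N}^{\mathbb{D}}$ is minimal, $\mathcal{N}^{\mathbb{DTR}}$ is isomorphic to $\mathcal{B}$. Thus $L_{s',I}(\mathcal{N}^{\mathbb{DTR}}) = B_k$ and $L_{s'',I}(\mathcal{N}^{\mathbb{DTR}}) = B_l$ for some $k, l \in [r]$. Since $u^R \in L_{I,s'}(\mathcal{N}^{\mathbb{D}})$, we have $u \in L_{s',I}(\mathcal{N}^{\mathbb{DR}}) = L_{s',I}(\mathcal{N}^{\mathbb{DRT}}) = L_{s',I}(\mathcal{N}^{\mathbb{DTR}}) = B_k$. This, together with $u \in B_i$, yields $k = i$. Similarly, $v^R \in L_{I,s''}(\mathcal{N}^{\mathbb{D}})$ and $v \in B_i$ implies $l = i$. Thus, $L_{s',I}(\mathcal{N}^{\mathbb{DTR}}) = L_{s'',I}(\mathcal{N}^{\mathbb{DTR}})$. But then $L_{I,s'}(\mathcal{N}^{\mathbb{DT}}) = L_{I,s''}(\mathcal{N}^{\mathbb{DT}})$, which contradicts the inequality $s' \ne s''$. Therefore $\mathcal{N}^{\mathbb{R}}$ is atomic.

To prove the converse, assume that $\mathcal{N}^{\mathbb{R}}$ is atomic; then, for every state $q$ of $\mathcal{N}^{\mathbb{R}}$, there is a set $H_q \subseteq [r]$ such that $L_{q,I}(\mathcal{N}^{\mathbb{R}}) = \bigcup_{i \in H_q} B_i$. This implies that $L_{I,q}(\mathcal{N}) = \bigcup_{i \in H_q} B_i^R$ for every state $q$ of $\mathcal{N}$.

Let $\mathcal{N}^{\mathbb{D}} = (S, \Sigma, \gamma, I, G)$, and suppose that $\mathcal{N}^{\mathbb{D}}$ is not minimal. Then there are at least two states $s'$ and $s''$ of $\mathcal{N}^{\mathbb{D}}$, $s' \ne s''$, with $L_{s',G}(\mathcal{N}^{\mathbb{D}}) = L_{s'',G}(\mathcal{N}^{\mathbb{D}})$. Let $\mathcal{D}_m = (Q_m, \Sigma, \delta_m, I, F_m)$ be a minimal DFA equivalent to $\mathcal{N}^{\mathbb{D}}$. Then there must be a state $s$ of $\mathcal{D}_m$ such that $L_{s,F_m}(\mathcal{D}_m) = L_{s',G}(\mathcal{N}^{\mathbb{D}}) = L_{s'',G}(\mathcal{N}^{\mathbb{D}})$. Then $L_{I,s'}(\mathcal{N}^{\mathbb{D}}) \subset L_{I,s}(\mathcal{D}_m)$ and $L_{I,s''}(\mathcal{N}^{\mathbb{D}}) \subset L_{I,s}(\mathcal{D}_m)$ must also hold.

Since $\mathcal{N}^{\mathbb{DM}}$ is isomorphic to $\mathcal{D}_m$, and $\mathcal{N}^{\mathbb{DMTR}}$ is isomorphic to the átomaton $\mathcal{B}$ of $L(\mathcal{N}^{\mathbb{R}})$ by Part 4 of Theorem 3, also $(\mathcal{D}_m)^{\mathbb{TR}}$ is isomorphic to $\mathcal{B}$. Thus $L_{s,I}((\mathcal{D}_m)^{\mathbb{TR}}) = B_i$ for some $i \in [r]$. This implies that $L_{I,s}((\mathcal{D}_m)^{\mathbb{T}}) = B_i^R$. Thus $L_{I,s'}(\mathcal{N}^{\mathbb{D}}) \subset B_i^R$.

On the other hand, the left language of state $s'$ of $\mathcal{N}^{\mathbb{D}}$ consists of all words $u$ such that $u \in L_{I,q'}(\mathcal{N})$ for every $q' \in s'$, but $u \notin L_{I,q}(\mathcal{N})$ for any $q \notin s'$. That is,

$$L_{I,s'}(\mathcal{N}^{\mathbb{D}}) = \Big( \bigcap_{q' \in s'} \bigcup_{i \in H_{q'}} B_i^R \Big) \setminus \Big( \bigcup_{q \notin s'} \bigcup_{i \in H_q} B_i^R \Big).$$

Since by Proposition 1, Part 1, $B_i \cap B_j = \emptyset$ for all $i, j \in [r]$, $i \ne j$, then also $B_i^R \cap B_j^R = \emptyset$, and the result of any boolean combination of sets $B_i^R$, where $i \in [r]$, cannot be a proper subset of any $B_i^R$. Therefore, $L_{I,s'}(\mathcal{N}^{\mathbb{D}}) \subset B_i^R$ cannot hold and thus $\mathcal{N}^{\mathbb{D}}$ must be minimal. □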

Corollary 3. If $\mathcal{N}$ is a non-minimal DFA, then $\mathcal{N}^{\mathbb{R}}$ is not atomic.

Theorem 5 can be rephrased as follows: $\mathcal{N}$ is atomic if and only if $\mathcal{N}^{\mathbb{RD}}$ is minimal. Sengoku defines an NFA $\mathcal{N}$ to be in standard form [11] if and only if $\mathcal{N}^{\mathbb{RD}}$ is minimal, and also shows that the right language of every state of an NFA in standard form is equal to the union of right languages of some states of the normal automaton (that is, our átomaton).
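
Theorem 5 suggests an indirect test for atomicity that avoids computing atoms: for a trim NFA, compare the size of the subset-construction DFA with the size of the minimal DFA. The sketch below obtains the minimal DFA by double reversal, which is one possible choice rather than something prescribed here; the 5-tuple representation and all names are assumptions, and the sample NFA is the trim NFA of Fig. 1 (a) as encoded from Example 2.

```python
from collections import deque

def reverse(nfa):
    """Swap initial and final states and reverse every transition."""
    states, alphabet, delta, initial, final = nfa
    rdelta = {}
    for (q, a), targets in delta.items():
        for p in targets:
            rdelta.setdefault((p, a), set()).add(q)
    return (states, alphabet, rdelta, set(final), set(initial))

def determinize(nfa):
    """Subset construction over the reachable subsets only."""
    _, alphabet, delta, initial, final = nfa
    start = frozenset(initial)
    seen, queue, ddelta = {start}, deque([start]), {}
    while queue:
        s = queue.popleft()
        for a in alphabet:
            t = frozenset(p for q in s for p in delta.get((q, a), ()))
            ddelta[(s, a)] = {t}
            if t not in seen:
                seen.add(t)
                queue.append(t)
    return (seen, alphabet, ddelta, {start}, {s for s in seen if s & set(final)})

def subset_construction_is_minimal(nfa):
    """True iff determinize(nfa) has as many states as the minimal DFA.
    By Theorem 5 (for trim nfa) this holds iff reverse(nfa) is atomic."""
    direct = determinize(nfa)
    minimal = determinize(reverse(determinize(reverse(nfa))))
    return len(direct[0]) == len(minimal[0])

# Hypothetical encoding of the trim NFA of Fig. 1 (a), taken from Example 2.
N = ({1, 2, 3}, {"a", "b"},
     {(1, "b"): {2}, (2, "a"): {1}, (2, "b"): {2, 3}, (3, "a"): {1}, (3, "b"): {3}},
     {1, 3}, {2, 3})
```

For this NFA the check fails for `N` (five subsets against three states in the minimal DFA), so by Theorem 5 its reverse is not atomic; it succeeds for `reverse(N)`, so `N` itself is atomic.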

6 Conclusions 

We have introduced a natural set of languages — the atoms — that are defined 
by every regular language. We then defined a unique NFA, the atomaton, and 
related it to other known concepts. We introduced atomic automata, and gen- 
eralized Brzozowski's method of minimization of DFA's by double reversal. 